127 research outputs found

    A Universal Parallel Two-Pass MDL Context Tree Compression Algorithm

    Full text link
    Computing problems that handle large amounts of data necessitate the use of lossless data compression for efficient storage and transmission. We present a novel lossless universal data compression algorithm that uses parallel computational units to increase the throughput. The length-NN input sequence is partitioned into BB blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of BB, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) context tree source underlying the entire input, and then encode each of the BB blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is O(N/B)O(N/B). Its redundancy is approximately Blog(N/B)B\log(N/B) bits above Rissanen's lower bound on universal compression performance, with respect to any context tree source whose maximal depth is at most log(N/B)\log(N/B). We improve the compression by using different quantizers for states of the context tree based on the number of symbols corresponding to those states. Numerical results from a prototype implementation suggest that our algorithm offers a better trade-off between compression and throughput than competing universal data compression algorithms.Comment: Accepted to Journal of Selected Topics in Signal Processing special issue on Signal Processing for Big Data (expected publication date June 2015). 10 pages double column, 6 figures, and 2 tables. arXiv admin note: substantial text overlap with arXiv:1405.6322. Version: Mar 2015: Corrected a typ

    A Parallel Two-Pass MDL Context Tree Algorithm for Universal Source Coding

    Full text link
    We present a novel lossless universal source coding algorithm that uses parallel computational units to increase the throughput. The length-NN input sequence is partitioned into BB blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of BB, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) source underlying the entire input, and then encode each of the BB blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is O(N/B)O(N/B). Its redundancy is approximately Blog(N/B)B\log(N/B) bits above Rissanen's lower bound on universal coding performance, with respect to any tree source whose maximal depth is at most log(N/B)\log(N/B)

    Empirical Bayes and Full Bayes for Signal Estimation

    Full text link
    We consider signals that follow a parametric distribution where the parameter values are unknown. To estimate such signals from noisy measurements in scalar channels, we study the empirical performance of an empirical Bayes (EB) approach and a full Bayes (FB) approach. We then apply EB and FB to solve compressed sensing (CS) signal estimation problems by successively denoising a scalar Gaussian channel within an approximate message passing (AMP) framework. Our numerical results show that FB achieves better performance than EB in scalar channel denoising problems when the signal dimension is small. In the CS setting, the signal dimension must be large enough for AMP to work well; for large signal dimensions, AMP has similar performance with FB and EB.Comment: This work was presented at the Information Theory and Application workshop (ITA), San Diego, CA, Feb. 201

    A Study on the Impact of Locality in the Decoding of Binary Cyclic Codes

    Full text link
    In this paper, we study the impact of locality on the decoding of binary cyclic codes under two approaches, namely ordered statistics decoding (OSD) and trellis decoding. Given a binary cyclic code having locality or availability, we suitably modify the OSD to obtain gains in terms of the Signal-To-Noise ratio, for a given reliability and essentially the same level of decoder complexity. With regard to trellis decoding, we show that careful introduction of locality results in the creation of cyclic subcodes having lower maximum state complexity. We also present a simple upper-bounding technique on the state complexity profile, based on the zeros of the code. Finally, it is shown how the decoding speed can be significantly increased in the presence of locality, in the moderate-to-high SNR regime, by making use of a quick-look decoder that often returns the ML codeword.Comment: Extended version of a paper submitted to ISIT 201

    Rate-Optimal Streaming Codes for Channels with Burst and Isolated Erasures

    Full text link
    Recovery of data packets from packet erasures in a timely manner is critical for many streaming applications. An early paper by Martinian and Sundberg introduced a framework for streaming codes and designed rate-optimal codes that permit delay-constrained recovery from an erasure burst of length up to BB. A recent work by Badr et al. extended this result and introduced a sliding-window channel model C(N,B,W)\mathcal{C}(N,B,W). Under this model, in a sliding-window of width WW, one of the following erasure patterns are possible (i) a burst of length at most BB or (ii) at most NN (possibly non-contiguous) arbitrary erasures. Badr et al. obtained a rate upper bound for streaming codes that can recover with a time delay TT, from any erasure patterns permissible under the C(N,B,W)\mathcal{C}(N,B,W) model. However, constructions matching the bound were absent, except for a few parameter sets. In this paper, we present an explicit family of codes that achieves the rate upper bound for all feasible parameters NN, BB, WW and TT.Comment: shorter version submitted to ISIT 201

    Sequential Gradient Coding For Straggler Mitigation

    Full text link
    In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation in the presence of stragglers. In this paper, we consider the distributed computation of a sequence of gradients {g(1),g(2),,g(J)}\{g(1),g(2),\ldots,g(J)\}, where processing of each gradient g(t)g(t) starts in round-tt and finishes by round-(t+T)(t+T). Here T0T\geq 0 denotes a delay parameter. For the GC scheme, coding is only across computing nodes and this results in a solution where T=0T=0. On the other hand, having T>0T>0 allows for designing schemes which exploit the temporal dimension as well. In this work, we propose two schemes that demonstrate improved performance compared to GC. Our first scheme combines GC with selective repetition of previously unfinished tasks and achieves improved straggler mitigation. In our second scheme, which constitutes our main contribution, we apply GC to a subset of the tasks and repetition for the remainder of the tasks. We then multiplex these two classes of tasks across workers and rounds in an adaptive manner, based on past straggler patterns. Using theoretical analysis, we demonstrate that our second scheme achieves significant reduction in the computational load. In our experiments, we study a practical setting of concurrently training multiple neural networks over an AWS Lambda cluster involving 256 worker nodes, where our framework naturally applies. We demonstrate that the latter scheme can yield a 16\% improvement in runtime over the baseline GC scheme, in the presence of naturally occurring, non-simulated stragglers
    corecore